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A Quantile Implementation of the EM Algorithm and Its 
Applications to Parameter Estimation with Interval Data 



Abstract 



The expectation-maximization (EM) algoritimi is a powerful computational technique for 
^^ ' finding the maximum likelihood estimates for parametric models when the data are not fully 

Si^ ' observed. The EM is best suited for situations where the expectation in each E-step and the 

jrt ' maximization in each M-step are straightforward. A difficulty with the implementation of 

-(— > ' 
C/3 ' the EM algorithm is that each E-step requires the integration of the posterior log-likelihood 

function. This can be overcome by the Monte Carlo EM (MCEM) algorithm. This MCEM 

^ ^ uses a random sample to estimate the integral in each E-step. But this MCEM converges very 

■"^ I slowly to the true integral, which causes computational burden and instability. In this paper 

^^ , we present a quantile implementation of the expectation-maximization (QEM) algorithm. 



This proposed method shows a faster convergence and greater stability. The performance of 

m : 

("^ , the proposed method and its applications are numerically illustrated through Monte Carlo 

simulations and several examples. 



Keywords: EM algorithm, Grouped data. Incomplete data, Interval data. Maximum likelihood, 
MCEM, Missing data, Quantile. 



1 Introduction 

The analysis of lifetime or failure time data has been of considerable interest in many branches of 
statistical applications. In many experiments, censoring is very common due to inherent limita- 
tions, or time and cost considerations on experiments. The data are said to be censored when, for 
observations, only a lower or upper bound on lifetime is available. Thus, the problem of parameter 
estimation from censored samples is very important for real-data analysis. To obtain the parameter 
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estimate, some numerical optimization methods are required to find the MLE. However, ordinary 
numerical methods such as the Gauss-Seidel iterative method and the Newton-Raphson gradient 
method may be very ineffective for complicated likelihood functions and these methods can be 
sensitive to the choice of starting values used. 

For censored sample problems, several approximations of the MLE and the best linear unbiased 
estimate (BLUE) have been studied instead of direct calculation of the MLE. For exam ple, the 
proble m of parameter estimation from censored samples has been treated by several authors. iGupta 
(JI952I ) has studied the MLE and provided the BLUE for Type-I and Type-II censored samples from 



an normal distribution. 



Govindaraiulul (|l966l ) has derived the BLUE for a s ymmetrically Type-II 



censored sample from a Laplace distribution for sample size up to A^ = 20. 



BalakrishnanI (|l989t ) 



has given an appr o ximat ion of the MLE of the scale parameter of the Rayleigh distribution with 



censoring. 



SultanI (|l997l) has given an approximati on of the MLE for a Type-II censored sample 



from an normal distribution. 



BalakrishnanI (|l996f ) has given the BLUE for a Type-II censored 



sample fro m a Laplace distribu tion. The BLUE needs the coefhcients a^ and 5,, which were 



tabulated in 



BalakrishnanI (|1996l ). but the table is provided only for sample size up to iV = 20. In 



addition, the approximate MLE and the BLUE are not guaranteed to converge to the preferred 
MLE. The methods above are also restricted only to Type-I or Type-II (symmetric) censoring for 
sample size up to A^ = 20 only. 

These aforementioned deficiencies can be overcome by the EM algorithm. In many practical 
problems, however, the implementation of the ordinary EM algorithm is very difficult. Thus, 



Wei and Tanner 



1990al lbl) proposed to use the MCEM when the EM algorithm is not available. 
However, the MCEM algorithm presents a serious computational burden because in the E-step 
of the MCEM algorithm, Monte Carlo random sampling is used to obtain the expected posterior 
log-likelihood. Thus, it is natural to look for a better method. The proposed method using the 
quantile function instead of Monte Carlo random sampling has greater stability and also much 
faster convergence properties with smaller sample sizes. 

Moreover, in many experiments, more general incomplete observations are often encountered 
along with the fully observed data, where incompleteness arises due to censoring, grouping, quantal 
responses, etc. One general form of an incomplete observation is of interval form. That is, a lifetime 
of a subject Xi is specified as a^ < Xi < bi. In this paper, we deal with computing the MLE for 
this general form of incomplete data using the EM algorithm and its variants, MCEM and QEM. 
This interval form can handle right-censoring, left-censoring, quantal responses and fully-observed 



observations. The proposed method includes the aforementioned existing methods as a special 
case. This proposed method can also handle the data from intermittent inspection which are 
referred to as groupe d dat a which provide onl y the n umber of failures in each inspection period. 
Seo and YumI ( 1993 ) and IShapiro and Gulati ( 1998!) h ave given an approximation of the MLE 



under the exponential distribution only. 



Nelson 



1982 ) described maximum likelihood methods, 



but the MLE should be obtained by ordinary numerical methods. The proposed method enables 
us to obtain the MLE through the EM or QEM sequences under a variety of distribution models. 

The rest of the paper is organized as follows. In Section [21 we introduce the basic concept 
of the EM and MCEM algorithms. In Section [3J we present the quantile implementation of the 
EM algorithm. In Section 01 we provide the likelihood construction with interval data and its EM 
implementation issues. Section [5] deals with the parameter estimation procedure of exponential, 
normal, Laplace, Rayleigh, and WeibuU distributions with interval data. In order to compare the 
performance of the proposed method with the EM and MCEM methods, Monte Carlo simulation 
study is presented in Section [5] followed up with examples of various applications in Section [T] This 
paper ends with concluding remarks in Section [5] 



2 The EM and MCEM Algorithms 



In this section, we give a brief introduction of the EM and MCEM algorithms. The EM algo- 
rithm is a powerful computational technique for finding the MLE of parametric models when 
the re is no closed-form M LE, or the data are incomplete. The EM algorithm was introduced 
by [Dempster et ali (J1977I ) to ov ercome the above difficu lt ies. For more d etails ab o ut th is EM 
algorithm, good referen ces are iLittle and RubinI (|2002| ). iTanneiJ (|l996l ). ISchafeiJ (J1997I ). and 
Hunter and Langd ( 2004 ). 



When the closed-form MLE from the likelihood function is not available, numerical methods 
are required to find the maximizer (i.e., MLE). However, ordinary numerical methods such as the 
Gauss-Seidel iterative method and the Newton-Raphson gradient method may be very ineffective 
for complicated likelihood functions and these methods can be sensitive to the choice of starting 
values used. In particular, if the likelihood function is flat near its maximum, the methods will 
stop before reaching the maximizer. These potential problems can be overcome by using the EM 
algorithm. 



The EM algorithm consists of two iterative steps: (i) Expectation step (E-step) and (ii) Maxi- 
mization step (M-step). The advantage of the EM algorithm is that it solves a difficult incomplete- 
data problem by constructing two easy steps. The E-step of each iteration only needs to compute 
the conditional expectation of the log-likelihood with respect to the incomplete data given the 
observed data. The M-step of each iteration only needs to find the maximizer of this expected log- 
likelihood constructed in the E-step, which only involves handling "complete-data" log-likelihood 
function. Thus, the EM sequences repeatedly maximize the posterior log-likelihood function of 
the complete data given the incomplete data instead of maximizing the potentially complicated 
likelihood function of the incomplete data directly. An additional advantage of this method com- 
pared to other direct optimization techniques is that it is very simple and it converges reliably. In 
general, if it converges, it converges to a local maximum. Hence in the case of the unimodal and 
concave likelihood function, the EM sequences converge to the global maximizer from any start- 
ing value. We can employ this methodology for parameter estimation from interval data because 
interval data models are special cases of incomplete (missing) data models. 

Here, we give a brief introduction of the EM and MCEM algorithms. Denote the vector of 
unknown parameters hy 6 = {9i, ... ^9p). Then the complete-data likelihood is 

n 

1=1 

where x — (xi, . . . , x„) and we denote the observed part of x by y = (j/i, . . . , ym) and the incom- 
plete (missing) part by z = {zm+i, . • . , Zn). Denote the estimate at the s-th EM sequences by O^^K 
The EM algorithm consists of two distinct steps: 

• E-step: Compute Q(0|0('*^) 

where g(6/|6/(^') = /logi'=(0|y, z)p(z|y, 0'''))rfz. 

• M-step: Find O'-'^'^'^ 

which maximizes Q{6\6^'^') in 6. 



In some problems, the implementation of the E-step is difficult. IWei and Tanneij (|l990al lbf) 
propose to use the MCEM to avoid this difficulty. The E-step is approximated by using Monte 
Carlo integration. Simulating z^+i, . . . , z„ from the conditional distribution p(z|y, 6^'^'), we can 
approximate the expected posterior log-likelihood as follows: 

K 

K 



1 K 



k=l 



where z^'^) ~ (zm+i.^, . . . , Zn^k)- This method is called the Monte Carlo EM (MCEM) algorithm. 
Major drawback to MCEM is that it is very slow because it requires a large sample size in order 
to possess stable convergence properties. This problem can be overcome by the proposed method 
using the quantile function. 



3 The Quantile Implementation of the EM Algorithm 



The key idea underlying the quantile implementation of the EM algorithm c an be easily illu s trated 



Freireich et 



( 19631) 



by the following example. The data in the example were first presented by 

and have since then been used very frequently fo r illu stration in the reliabih ty engineering and 

survival analysis literature including iLeemisI (J1995I ) and ICox and Oaked (J1984J ). 



3.1 Example: Length of Remission of Leukemia Patients 

An experiment is conducted to determine the effect of a drug named 6-mercaptopurine (6-MP) on 
leukemia remission times. A sample of size n — 21 leukemia patients is treated with 6-MP and the 
times of remission are recorded. There are m ~ 9 individuals for whom the remission time is fully 
observed, and the remission times for the remaining 12 individuals are randomly censored on the 
right. Letting a plus (+) denote a censored observation, the remission times (in weeks) are 

6 6 6 6+7 9+10 10+ 11+ 13 16 
17+ 19+ 20+ 22 23 25+ 32+ 32+ 34+ 35+ 



Using an exponential model, we can obtain the complete likelihood function and the conditional 



pdf 



^ m -. n 

log L^(0|y, z) =-n log cr V y, V z,. 



a ^ — ^ a 

2 — 1 Z— 771+1 



p(z 



n n ^ 

y,^(^))= n v.AzM^'')= n 713)' 



i^m+l 



-{z,-Ri)lG^'^ 



Zr > Ri 



, ^i ^ J-i-i^ 



where Ri is a right-censoring time of test unit i. Using the above conditional pdf, we have the 
expected posterior log-hkehhood 

Q(a|a(^)) 
^ J\ogL-{a\y,z)p{z\y,e^'^)dz 

= -nlogcr y^yi V" ZiP:,.{zi\a^'''>)dzi 

a ^ — ' a ^ — ' J 

i—l i—m+l 

(s) 1 ™ 1 " 

= -nlog cr - (n- to) T^ 2/j T^ Ri- 

a a ^-^ a ^-^ 

i=l i— rn+1 

Then the Monte Carlo approximation of the expected posterior log-likelihood is given by 

-. m -. -. ii" n 

Q(cr|cr(")) = -n log cr- -^y,- -—^ ^ Zj,fc. 

z— 1 fc— 1 i—m+1 

where a random sample Zi^k is from ^^^(zilcr*^''^) = — ^e^'^^'^^'^Z'^ . In the Monte Carlo approxi- 
mation, the term J ZiPzi{zi\a^''')dzi is approximated by 

1 ^ 

•^ fc=i 

This approximation can be improved by using the quantile function. For the conditional pdf 
Pzi{zi^k\o''^^''), the quantiles of ^fe, denoted by qt^k, are given by 

q^.k = F~\^k\R^, 't(^)) = i?, - (t(^) l0g(l - Cfc), 

for < ^fe < 1. We can choose ^^ from any of the fractions, k/K, k/{K + 1), (fc — ^)/K, etc. 
Using the quantile function, we have the following approximation 

K 



/ z,p^^{z,\a'-''^)dz, ~ j^^q 



k=l 

It is noteworthy that that a random sample Zi^k in the Monte Carlo approximation is usually 
generated by using the above quantile function with ^k from a random sample having a uniform 
distribution between and 1. 

Fig.[T]presents the MCEM and QEM approximations of expected posterior log-likelihood func- 
tions for K — 10 (dashed curve), 100 (dotted curve) and 1000 (dot-dashed curve) at the first step 
(s — 1), along with the exact expected posterior log-likelihood (solid curve). The MCEM and 
QEM algorithms were run with starting value cr'-^' = 1. As can be seen in the figure, the MCEM 
and QEM both successfully converge to the expected posterior log-likelihood as K gets larger. 
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Figure 1: The expected posterior log-likelihood functions and approximations, (a) Monte Carlo 
approximations, (b) Quantile approximations. 

Note that the QEM is much closer to the true expected posterior log- likelihood for smaller values 
oiK. 

Fig. [2] displays the iterations of the EM and QEM sequences in the example from the starting 
value a'^'^^ = 1. The horizontal solid lines indicate the MLE {a — 39.89). The figures clearly show 
that the QEM is stable and converges very fast to the MLE. Even with very small quantile sizes, 
the QEM outperforms the MCEM. It should be noted that the QEM with K = 100 performs 
better than the MCEM with K = 10, 000. 

3.2 Convergence Properties of the MCEM and QEM Algorithms 



Another way to view the quantile implementation idea is by looking at the Riemann-Stieltjes 
integral. For simplicity of presentation, we consider the case where z is one-dimensional. Denote 
h{0,z) — logL'^{0\y,z). Let us consider a following Riemann-Stieltjes sum, 

K 

K 



l^Me,F-i(a)). 



fc=i 



In the limit as i^ — >■ oo, we have 



Jh{e,F-\0)dt 
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Figure 2: Successive parameter estimates using (a) the MCEM and (b) the QEM. The horizontal 
sohd hues indicate the MLE (ct = 39.89). 

Using a change-of- variable integration technique with z = F~^(£^), we have 

h{e,z)dF{z)^ h{e,z)f{z)dz. 



Hence the quantile approximation of the expectation posterior log-likelihood is a Riemann-Stieltjes 
sum which converges to the true integration. With ^^ = (fc — ^)/K, this sum is also known as the 
extended midpoint rule and it is accurate to the order of 0{1/K^); see 
is. 



Press et al. 



(12003). That 



/, K -. 

h{e,z)f{z)dz = -Y,h{e,qu) + o(-^ 

k=l 



where qk ^ F ^{£,k) 



On the other hand, the accuracy of the Monte Carlo approximation 

K 



h-K = j7^He, Zk), 



k=l 



(1) 



can be assessed as follows. By the central limit theorem, we have 

^Y&T[h[e,z)) 

and this is accurate to the order of Op (1). It is immediate from the weak law of large numbers 
that hK -^ E{h{6, Z)). Using this and ([1]), we have 

J hie, z)f(z)dz = ;^ E '^(^' ^^) + ^^(71 



k=l 



From this, it is clear that the QEM is much more accurate than the MCEM. 

We can generahze the above resuh as follows. In the E-step, we replace the Monte Carlo 
approximation with the quantile approximation 

1 ^ 

fc=i 

where q^") = (<7m+i,fc, • • • , <?n,fc) with g^^fe = F^^{^k\zi, 0^^^), and ^k is any fraction. In this paper, 
we use £,k = {k — \)/K. 

It should be noted that the approximation of the expected posterior log-likelihood in the pro- 
posed method can be viewed as being similar to a quasi-Monte Carlo approximation in the sense 
that the quasi-Monte Carlo approximat ion also uses a deterministic sequence rather than a ran- 



dom sample. In fact, iNiederreiteiJ (|l992l ) shows that there exist such sequences in the normalized 



integration domain, which ensure an ac curacy to the order of 0{K -'^(logX)'' ^), where d is the 



dimension of the integration space (see lRobert and Casellal Il999f) . Thus, using the quasi-Monte 



Carlo sequences in the normalized integration domain, one can improve the accuracy of the inte- 
gration in the E-step of the MCEM algorithm which leads to accuracy on the order of 0{K^^) 
with d — 1. However, we should point that the proposed QEM method leads to accuracy on the 
order of 0{K~^). Therefore, although using the quasi-Monte Carlo approximation can improve 
the MCEM, the inaccuracy in that case will be greater than that of the proposed QEM method. 
Also incorporating the quantiles from the proposed method into the M-step to obtain the MLE is 
quite straightforward. On the other hand, if the quasi-Monte Carlo sequences in the normalized 
integration domain are used, it would be irrelevant to the use of its sequences in the M-step to 
obtain the maximizer. Thus, the focus of the paper is to construct the EM algorithm using the 
quantiles so that the MLEs can be straightforwardly obtained. 



4 Likelihood Construction 

In this section, we develop the likelihood functions which can be conveniently used for the EM, 
MCEM and QEM algorithms. 

The general form of an incomplete observation is often of interval form. That is, the lifetime of 
a subject Xi may not be observed exactly, but is known to fall in an interval: a^ < Xi < bi. This 
interval form includes censored, grouped, quantal-response, and fully-observed observations. For 




example, a lifetime is left-censored when a.i — — cxd and a lifetime is right-censored when hi — oo. 
The lifetime is fully observed when ai = hi. 

Suppose that x — (a;i, . . . ,a;„) are observations on random variables which are independent 
and identically distributed and have a continuous distribution with the pdf f{x) and cdf F{x). 
Interval data from experiments can be conveniently represented by pairs {wi, Si) with Wi — [ai, hi], 

for i — 1, . . . ,n, 

where Si is an indicator variable and ai and hi are lower and upper ends of interval observations of 
test unit i, respectively. If a^ — hi, then the lifetime of the i-th test unit is fully observed. Denote 
the observed part of x = (xi, . . . , x„) by y = (j/i, . . . , ?/„) and the incomplete (missing) part by 
z — (z„i_|_i, . . . , z„) with ai < Zi < hi. Denote the vector of unknown distribution parameters 
hy 6 = (01, . . . ,9d)- Then ignoring a normalizing constant, we have the complete-data likelihood 
function 

n 

L'{e\y,z)^Y[f{x,\e). (2) 

1=1 

Integrating L^(0|x) with respect to z, we obtain the observed-data likelihood 

„ rn n 

L{d\y)^ I L^{e\y,z)dz = l[f{y,\e) [] {F{b,\e) - F{a,\9)}, 

2—1 j — 771+1 

where in general an empty product is taken to be one. Using the (wi, Si) notation, we have 

n 

L{e\w, ^) ex n f{m\ef' {F{b,\9) - Fia,\e)y-'\ (3) 

i=l 

where w = (wi, . . . , Wn) and S — ((5i, . . . , (5„). Here, although we provided the likelihood function 
for the interval-data case, it is e asily extended to m ore general forms of i ncomp lete data. For more 



details, the reader is referred to 



HeitianI (|l989l ) and lHeitjan and RubinI (|l990l ). 



Thus, the goal is inference about 6 given the complexity of the likelihood, and the EM algorithm 
is a tool that can be used to accomplish this goal. Then the issue here is how to implement the EM 
algorithm when there are interval data in the sample. By treating the interval data as incomplete 
(missing) data, it is possible to write the complete-data likelihood. This treatment allows one to 
fine the closed-form maximizer in the M-step. For convenience, assume that all the data are of 
interval form with Oi < Wi < hi and ai < hi. Then the likelihood function in ^ can be rewritten 
as 

n 

L(e\vj)^W{F(h\e)-F{a,\e)}. (4) 
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Then the complete-data hkehhood function corresponding to @ becomes 

m n 

L^{e\y,z)^Y[f{y,\e)- n f{z,\e), (5) 

2—1 ?'— rn+1 

where the pdf of Zi is given by 



F{h\e)~F{a,\e)' 
for ai < z < bi. Using this, we have the foUowing Q-function in the E-step, 






It is worth looking at the integration when b.^ — ^ Oj. For convenience, omitting the subject 
index i and letting b = a + e, we have 

r \og f{z\e)-p,(z\e^'^)dz. (6) 

J a 

It fohows from the integration by parts that the above integral becomes 

^iog/(z|0) • p.{z\e^'^)] l^^ - £^^ j^ ■ p.{z\e^^^)dz, 



(7) 



where 

P.(z|.(^)) = ^ --. 

Applying I'Hospital rule to (O with ([7]), we obtain 

lim r \og f{z\e)-PAz\e^'^)dz = log f{a\e). 

Hence, for full observation, we simply use the interval [ai, ai] notation which implies [ai, a^ + e] with 
the limit as e ^- 0. All kinds of data considered in this paper can be denoted by the interval-data 
form without the indicator variable. Si. 

For notational convenience, we let zi — yi, . . . ,z„i — ym- Then the complete-data likelihood 
function corresponding to ([5]) becomes 

n 

L^(0|z)(x[]/(z,|0), (8) 

1=1 

where z = (zi, Z2, . . . , z,i). From now, unless otherwise specified, z refers to (zi, Z2, . . . , Zn) instead 

of {z„i+l,Z2,.. .,Z„). 

Thus, we use (|3]) or ([S]) for the likelihood function or complete-data likelihood function instead 
of (m or ^, respectively. 
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For many distributions including Weibull and Laplace distributions, it is extremely difficult or 
may be impossible to implement the EM algorithm with interval data. This is because, during 
the E-step, the Q-function does not integrate easily and this causes computati onal difficulties in 



the M -step. In order to avoid this problem, one can use MCEM algorithm (jWei and Tanner 



1990al ) which reduces the difficulty in the E-step through the use of a Monte Carlo integration. 
As aforementioned, although it can make some problems tractable, the MCEM involves a serious 
computational burden and can often lead to unstable estimates. Thus, we proposed a quantile 
implementation of the EM algorithm which alleviates some of the computational burden of the 
MCEM and leads to more stable estimates. 

For stopping criterion for the EM, MCEM or QEM algorithm, the algorithm stops if the changes 
are all relatively small compared to a given precision e. As an example for normal distribution, 
the QEM algorithm stops if |^("+i) - ^(*) | < e/x(^+i' and |cr(^+i) - a'-'^ \ < ecr'^+i) as well. In what 
follows, we obtain the EM (if available), MCEM, and QEM sequences for a variety of distributions, 
which maximize the likelihood function in ([3]). 



5 Parameter Estimation 

In this section, we briefly provide the inferential procedure for the parameter estimation of ex- 
ponential, normal, Laplace, Rayleigh, and Weibull distributions from random samples in interval 
form. For the exponential and normal distributions, the ordinary EM algorithm applies, so the 
MCEM and QEM are not needed. To compare the performance of the MCEM and QEM, how- 
ever, we include the exponential and normal distributions although the ordinary EM algorithms 
are available. For the Laplace distribution, the computation of the E-step is very complex, so 
either the MCEM or the QEM is more appropriate. For the Rayleigh and Weibull distributions, 
the calculation of the integration in the E-step does not have a closed form. So, it is not feasible 
to use the ordinary EM algorithm. As aforementioned, it is noteworthy that the QEM sequences 
are easily obtained by replacing a random sample z^'^' in the MCEM sequences with a quantile 
sample q'^'"'^. 
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5.1 Exponential Distribution 

We assume that Z^ areiidexponentialrandomvariables withthepdf givenby /(z|A) = Acxp(— Az). 
Using (O, we have the complete-data log-Hkehhood of A 

n 

logL^(A)-^(logA-Az,), 

where the pdf of Zi is given by 

/ l^^ Aexp(— Az) 

exp(— Aoi) — exp(— Aoi) 

for at < z < hi. When Ui — bi, the above random variable Zt degenerates at Zi = ai. 

• E-step: 

The Q{-) function is given by 



g(A|A(^))=nlogA-A^Af\ 



where for a^ < bi 

l-b 



A^^ ^ EiZ.lM^^] ^ /V,.,(.|A(-)).. ^ a.exp( A(-)a ) - MxpM(j)M ^ 1 _ 
Ja, c-xp{ - X(^ > tti ) -ex.p[-X('' I bi) Xy^) 

When ai = bi, we have A]'^' = ai. 

• M-step: 

Differentiating Q(A|A(''^) with respect to A and setting this to zero, we obtain 

aQ(A|A(-)) _ n ^As)_. 

Solving for A, we obtain the (s -\- l)st EM sequence in the M-step 

^(,+1) ^ n ^ 

2^1=1 ^i 

If we instead use the MCEM (or QEM) algorithm by simulating (or quantiling) zi , . . . , 2;„ from 
the truncated normal distribution p(z|0'^''), we then obtain the MCEM (or QEM) sequences 

;^(s+l) 



where Zi^k for k — \,2, . . . ,K are from the truncated exponential distribution ^^^(zlA'*-') defined 
in ®. 
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It is of interest to consider the case where the data are right-censored. In this special case, the 
closed-form MLE is known. If the data are fully observed (i.e., Wi = [0^,0^]) for i — l,2,...,r, 
it is easily seen from I'Hospital rule that A^ = a,;. If the observation is right-censored (i.e., 
Wi — [ai,oo]) ioT i = r + 1 , . . . , n, we have Al'^' = a^ + \/\'^^\ Substituting these results into ([TU|) 
leads to 

A(^+i) = - fin 

Note that solving the stationary-point equation A = A'^^^^ — A*^"*' of PT|) gives 

A 



En 

As expected, this result is the same as the well-known closed-form MLE with the right-censored 
data. 

5.2 Normal Distribution 

We assume that Zi are iid normal random variables with parameter vector 6 = (/i, a). Then the 
complete-data log-likelihood is 

_. n n 

\ogL-{e) ex -^ loga^ - ^^2 _ —{Y^zj - 2m^z,}, 

4=1 i=l 

where the pdf of Zi is given by 

p..(z|M,a)^ ;^^^^^. (12) 

for ai < z <bi. Similarly as before, if ai — bi, then the random variable Zi degenerates at Zi — Ui. 

• E-step: 

Denote the estimate of 9 at the s-th EM sequence by 9^'^' = (/i(*',cr'^ ''•'). Ignoring constant 
terms, we have 

_, n n 



2a2^ 2a2— ^ a- 
where A\''' = E[Zf\9'-''^] and b|'^ = E[Z^\9'''^]. Using the following integral identities 



-n )dz = Ai$( ) - a(l){ ), 

a a a a 

'l^C-^^)dz ^ (^2 + ^2)<j,(i^) _ ^(^ ^ .)<^(^^), 
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we have for a, < 6, 






When Ui — bi, we have A^ = a^ and Bl — ai. 

• M-step: 

Differentiating the expected log-hkehhood Q{0\0^^') with respect to ^ and a^ and solving 
for /x and cr^, we obtain the EM sequences 



^'^^'^ = ^E^i^^ (13) 

1=1 

^2(^+1) ^i^^(^)_|^(s+l)|2^ (14) 

i=l 

If we instead use the MCEM (or QEM) algorithm by simulating (or quantiling) zi , . . . , z„ from 
the truncated normal distribution p(z|0'^''), we then obtain the MCEM (or QEM) sequences 

M^^^'' = -E^E-^^' (15) 

i=i k=i 



2(s+l) 1 v^ 1 



n K 

E^E-^,.-K"^n^ (16) 



n^^ K 

i=l k=l 



where Zi^k are from the truncated normal distribution ^^.(zifcl/i'^''', cr'-*') defined in (J12p . 

5.3 Laplace Distribution 

We assume that Zi are iid Laplace random variables with parameter 6 = (fj,, a) whose pdf is given 

by 

f{x\fi, cr = — exp 

2(7 V cr 

Using ([5]), we have the complete-data log- likelihood 

logL=(0|z) =C-nlogcr VIj/i-mI E l^«-/^l' 

where the pdf of Zi is given by 



a ^ — ' a 

z— 1 i— m+1 



for ai < z < bi. Similarly as before, if Oj = 6^, then the random variable Zi degenerates at Zi — ai. 
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• E-step: 

At the s-th step in the EM sequence denoted by 0^"^' — (/i(^\ cr'^*^), we have the expected 
log-hkehhood 



= f\ogL''{e\z)p{z\e^'^)dz 



1=1 ' 

The computation of the above integration part is very complex. We can overcome this 
difficulty by using MCEM (or QEM) approach. The approximate expected log-likelihood is 

K 



= lX^logi-(0|zW) 



K 

k=\ 



f A 1 "^ 



= C-nloga--^-^|z,,fe-M|, 
i=\ k=i 

where z'*^' = (^i.fe, ^2,fc, • ■ • , -Zn.fc), and Zi^k for k — 1,2, . . . , K are from pzi{z\6^') defined in 

• M-step: 

Then we have the MCEM (or QEM) sequences 

^(^^+1) =median(z(i),...,z(-^)), (18) 



a 



't^th^~^''^''\- (19) 



i=i fc=i 



5.4 Rayleigh Distribution 



Let Zi be iid Rayleigh random variables with parameter /3 whose pdf is given by 
Then the complete-data log-likelihood is 



/(z|/3) = ^exp(-— ), 2>0, /3>0. 



n ^ n 

logL^(/3|z)=C-2nlog/3 + ^logz.-^^2f, 
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where the pdf of Zi is given by 

VzMf^)^ — -, — —. -. — W\ (20) 

for Oi < z < bi. Siniilarly as before, if a^ — bi, then the random variable Zi degenerates at Zi — ai. 



• E-step: 

At the s-th step in the EM sequence denoted by /3^*' , we have the expected log-hkehhood 



Q(/3|/3^^^) 






^ j\ogL 


^(/3|z)p(z|/3(^ 


))dz 




n 


rh 



UJa. 2/32 

The calculation of the above integration part does not have a closed form. Using the MCEM, 
we have the approximate expected log-likelihood 

= -if^logL^(/3|zW) 

^ K n ^ ^ K n 

= C-2nlog/?+-5]5]logz..fc-— -^^z^fe, 

fc=l i=l ^ fe=l i=l 

where z^*^) = (zi,*;, . . . , Zn.k) and Zi^/c for A; = 1,2, ... ,K are from Pzi(z|/5*-*-') defined in ^U\j. 



• M-step: 

We then have the following MCEM (or QEM) sequences by differentiating (5(/3|/3(^)): 



P^' 



s+l) 



\ 



n K 

i=l fc=l 



5.5 Weibull Distribution 



We assume that Xi is iid Weibull random variables with the pdf and cdf of Xj given by f{x) 
X(3x^~^ exp{~Xx^) and F{x) ~ 1 — exp{—Xx^), respectively. 

Using (O, we obtain the complete-data log-likelihood oi 9 — (A, /3): 

n 

\ogL-{e) - ^ { log A + log/3 + (/? - 1) logz, - Azf }, 
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where the pdf of Zi is given by 



v.MQ) 



A/Sz/'-i exp(-Az'5) 



(22) 



exp(-Aaf)-exp(-A6,f)' 
for Ui < z < bi. Similarly as before, if Oj — bi, then the random variable Zi degenerates at Zi = a 



• E-step: 

Denote the estimate of 6 at the s-th EM sequence by 6^^' = (A^^^^/S*^ ''-'). It follows from 
Q{e\e'^''^)^E[logL%9)] that 



g(0|0(^)) = nlogA + nlog/3+(/3-l)^A|^)-A^Bf\ 



4=1 i=l 



where Al'> = ^[logZ,|6>(^)] and Sf ■> = E[Z^\e^'''>] . 

• M-step: 

Differentiating (3(A|A'^'*^) with respect to A and /3 and setting this to zero, we obtain 



^«^. ^-5: «<•.„)=„. 



d\ A 



i=l 



aQ(0|0(^))_n , ^ .,) ^^aB,p'(/3) 



a/3 /3 



-^E^i^^-^E' 



9/3 



= 0. 



Arranging for /3, we have the equation of /3 

15 + -} ^ 



n as,<'''(^) 



p n 



0. 



The (s + l)st EM sequence of (3 is the solution of the above equation. After finding (3^'^^^\ 
we obtain the (s + l)st EM sequence of A'^^^-* 



E:iii?r'(/3(^+^') 



In this Weibull case, it is extremely difficult or may be impossible to find the explicit expec- 
tations of -E[logZi|0^''''] and i?[zf j^^^-*] in the E-step, but the quantile function of the random 
variable Zi at the s-th step can be easily obtained. From 1^^, we have 



q.M^F^'iCkie^'^^) 



^ log {(1 - Cfc) exp(-A(^)af '° V a exp(-A(^)6f' )} 



Using the above quantiles, we obtain the following QEM algorithm. 
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• E-step: 

Denote the quantile approximation of Q{-) by Q(-). Then, we have 

n ^ K n ^ K 

Q(0|0(^))-nlogA + nlog/3+(/3-l)^-^logg,,fe-A^-5]gf,. 

i=l k=l j=l k=l 

• M-step: 

Differentiating (5(A|A'^'*^) with respect to A and /3 and setting this to zero, we obtain 

dQ{e\e^_n _ 1 .^.f p 

dX 'A K^^ '^^''^ " "' 
i=i fc=i 

^^^ = « + IF Z^ Z^ log*.fc " ^1^ Z^ Z^ <fc log *.^ = 0. 



'^ '^ i=l fc=l i=l fc=l 

Arranging for /3, we have the equation of /? 

^ + ^Y.Y.'^M^^- K , -0. (23) 

The (s + l)st QEM sequence of /3 is the solution of the above equation. After finding /J^'^+i) , 
we obtain the (s + l)st QEM sequence of A'^^^^-' 

nK 



A(^ 



2^i=l 2^fc=l 9i,fc 



In the M-step, we need to estimate the shape parameter (3 by numerically solving (j23p . but this 
is only a one-dimensional root search and the uniqueness of this solution is guaranteed. Lower 
and upper bounds for the root are explicitly obtained, so with these bounds we can find the root 
easily. We provide the proof of the uniqueness under quite reasonable conditions and give lower 
and upper bounds of /? in the Appendix. 



6 Simulation Study 

In order to examine the performance of the proposed method, we use the Monte Carlo simulations 
with 5000 replications. We present the performance of this new method with the EM and MCEM 
estimators by comparing their estimated biases and the mean square errors (MSE). The biases are 
calculated by the sample average (over 5000 replications) of the differences between the method 
under consideration and the MLE. The MSE are also obtained by the sample variance of the 
differences between the method under consideration and the MLE. 
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Table 1: Estimated biases and MSE, and SRE of estimators under consideration with normal data. 



Location 
Estimator 



Bias 



MSE 



SRE 



EM 

MCEM 
K = 10 
K = 10^ 
K = 10^ 
K = 10'' 

QEM 
K = 10 
K = 10^ 
K = 10^ 
K = 10'' 



1.342988x10" 



7.169276x10" 
2.223300x10" 
7.135135x10" 
2.265630x10" 



2.511190x10"^ 
2.382535x10"^ 
2.349116x10"" 
3.232357x10"^ 



1.779955x10" 



8.381887x10" 



2.123573x10"® 
8.170053x10"* 2.178633 xlO"'^ 
8.417492x10"^ 2.114590x10"*^ 
8.433365x10"** 2.110610x10"^^ 



2.558357x10"^ 6.957412x10"*^ 

2.305853x10"^ 7.719289x10"" 

2.432084x10"'' 7.318639x10"^ 

2.176558x10"^** 8.177841 xlO"^ 



Scale 
Estimator 



Bias 



a 
MSE 



SRE 



EM 

MCEM 
A"= 10 
K = 10^ 
K = 10^ 
K = 10* 

QEM 
K = 10 
K 
K 
K 



10^ 
10^ 
10* 



3.706033x10" 



1.139404x10" 
3.540433x10" 
1.137090x10" 
3.621140x10" 



5.580507x10"^ 
5.890585x10"^ 
6.315578x10"* 
9.699447x10"^ 



1.143094x10" 



2.133204x10" 
2.069714x10" 
2.130881x10" 



5.358577x10"'' 
5.522955x10"® 
5.364419x10"^^ 
2.150602x10"^ 5.315227x10"® 



6 



1.272262x10" 

1.418158x10" 

1.644996x10"® 

4.529568x10"^° 



8.984739x10" 
8.060413x10" 
6.948917x10" 
2.523627x10" 



First, a random sample of size n = 20 was drawn from an normal distribution with /i = 50 
and a — ^ and the largest five have been right-censored. All the algorithms were stopped after 10 
iterations (s = 10). The results are presented in Table [T] To help compare the MSE, we also find 
the simulated relative efficiency (SRE) which is defined as 

MSE of the EM estimator 
MSE of the estimator under consideration 

From the result, the EM is as efficient as the MLE (the MSE of the EM is almost zero). Compared 
to the MCEM, the QEM has much smaller MSE and much higher efficiency. For example with 
K = 10*, the SRE of the MCEM is only 2.110610 x lO^^ for ^ and 5.315227 x 10"*^ for a. On the 
other hand, the SRE of the QEM is 0.8177841 for fi and 0.2523627 for a. Comparing the resuhs 
in Table [H the QEM with only K = 100 performs better than the MCEM with K = 10, 000. 

Next, we draw a random sample of size n = 20 from a Rayleigh distribution with 13 — \Q with 
the largest five being right-censored. We compare the QEM only with the MCEM because the EM 
is not available. The results are presented in Table [2] for Rayleigh data. The results also show that 
the QEM clearly outperforms the MCEM. 
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Table 2: Estimated biases, MSE, and SRE of estimators under consideration with Rayleigh data. 



Estimator 



Bias 



MSE 



MCEM 
K = 10 
K = W 
K = W 
K = 10^ 
QEM 
K = 10 
K = 102 
K = 10^ 
K = 10^ 



0.1457421790 3.419463x10-2 

0.0450846748 3.260372x10"^ 

0.0142957976 3.283920 xlO"'* 

0.0045336330 3.269005x10"^ 

0.0560471712 5.554717x10"^ 

0.0057055322 5.739379x10"^ 

0.0005675668 5.517117x10-9 

0.0000527132 3.759086x10"" 



7 Examples of Application of the Proposed Methods 

This section provides four numerical examples of parameter estimation for a variety of distributions 
using the EM (if available), MCEM, and QEM algorithms. 

7.1 Censored Normal Sample 



Let us consider the data presented earlier bv iGuptal (jl952[ ) in which, out of iV = 10, the largest 
three observations have been censored. The Type-II right-censored observations are as follows: 

1.613, 1.644, 1.663, 1.732, 1.740, 1.763, 1.778. 

The MLE is /I = 1.742 and a = 0.079. 

Table 3: Iterations of the EM, MCEM, and QEM sequences with normal data. 







^^=' 






a(=) 




s 


EM 


MCEM 


QEM 


EM 


MCEM 


QEM 














1 


1 


1 


1 


1.8467 


1.8456 


1.8467 


0.2968 


0.2973 


0.2966 


2 


1.8058 


1.8074 


1.8057 


0.1931 


0.1959 


0.1930 


3 


1.7761 


1.7771 


1.7760 


0.1370 


0.1386 


0.1369 


4 


1.7593 


1.7597 


1.7593 


0.1070 


0.1076 


0.1069 


5 


1.7504 


1.7503 


1.7503 


0.0919 


0.0919 


0.0919 


6 


1.7459 


1.7458 


1.7459 


0.0848 


0.0847 


0.0848 


7 


1.7439 


1.7440 


1.7439 


0.0816 


0.0816 


0.0816 


8 


1.7429 


1.7428 


1.7429 


0.0802 


0.0802 


0.0802 


9 


1.7425 


1.7422 


1.7425 


0.0796 


0.0792 


0.0796 


10 


1.7424 


1.7421 


1.7424 


0.0793 


0.0789 


0.0793 
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Table 4: Iterations of the EM, MCEM and QEM sequences with Laplace data. 





M^ 


s) 


a 


s) 


s 


MCEM 


QEM 


MCEM 


QEM 











1 


1 


1 


49.76609 


49.76609 


4.320983 


4.318817 


2 


49.76609 


49.76609 


4.669010 


4.650584 


3 


49.76609 


49.76609 


4.669581 


4.683749 


4 


49.76609 


49.76609 


4.682357 


4.687064 


5 


49.76609 


49.76609 


4.693247 


4.687395 


6 


49.76609 


49.76609 


4.687793 


4.687429 


7 


49.76609 


49.76609 


4.693793 


4.687432 


8 


49.76609 


49.76609 


4.678954 


4.687432 


9 


49.76609 


49.76609 


4.702827 


4.687432 


10 


49.76609 


49.76609 


4.671909 


4.687432 



We use the EM sequences from (|13p and (|14p to compare with the MLE. Starting values are 
chosen by selecting arbitrary number (for example, /i^°-' = and a^ =1)- We obtain the same 
result as the MLE up to the third decimal point after around nine iterations. 

Next, using the MCEM sequences from ^ and ^, we obtained the MCEM and QEM 
estimates. The algorithm was run with K = 1, 000 for 10 iterations. 

Table [3] presents the iterations of the EM, MCEM, and QEM sequences for this problem. From 
the table, we can see that, when compared with the MCEM estimate, the QEM estimate is closer 
to the MLE and the EM estimate. 



7.2 Censored Laplace Sample 



Let us consider the data presented earlier by 



BalakrishnanI (|l996l ) in which, out of iV = 20 obser- 



vations, the largest two have been censored. The Type-II right-censored sample thus obtained is 
as follows: 

32.00692, 37.75687, 43.84736, 46.26761, 46.90651, 47.26220, 47.28952, 47.59391, 48.06508, 
49.25429, 50.27790, 50.48675, 50.66167, 53.33585, 53.49258, 53.56681, 53.98112, 54.94154. 



In this case, iBalakrishnanI (jl996[) computed the best linear unbiased estimates (BLUE) of /j, 
and (J to he fi = 49.56095 and <t = 4.81270. The MLE is fi = 49.76609 and & ^ 4.68761. 



We use the MCEM sequences from ^^ and ((191) for the MCEM and QEM estimates. The 
algorithms were run with K = 1,000 for 10 iterations with the starting value (/i^^-' = and 
ffi^) — 1). Table [H presents the iterations of the MCEM and QEM sequences. When compared 
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Table 5: Iterations of the MCEM with Rayleigh data. 





pis) 


P(s, 


s 


MCEM 


QEM 


MCEM 


QEM 





1 


1 


10 


10 


1 


5.3363 


5.3358 


7.3335 


7.2946 


2 


5.9395 


5.9444 


6.4458 


6.4435 


3 


6.0888 


6.0870 


6.2167 


6.2126 


4 


6.1170 


6.1221 


6.1488 


6.1536 


5 


6.1413 


6.1309 


6.1494 


6.1387 


6 


6.1336 


6.1330 


6.1356 


6.1350 


7 


6.1214 


6.1336 


6.1219 


6.1341 


8 


6.1290 


6.1337 


6.1291 


6.1338 


9 


6.1261 


6.1338 


6.1261 


6.1338 


10 


6.1292 


6.1338 


6.1292 


6.1338 



to the MCEM estimate, especially for a, the QEM estimate is closer to the MLE. Note also that 
both MCEM and QEM estimates are closer to the MLE than the BLUE. 

7.3 Censored Rayleigh Sample 

We simulated a data set with /3 = 5 in which, out of TV = 20 observations, the largest five have 
been censored. The Type-II right censored sample thus obtained is as follows: 

L950, 2.295, 4.282, 4.339, 4.411, 4.460, 4.699, 5.319, 
5.440, 5.777, 7.485, 7.620, 8.181, 8.443, 10.627. 

We use the MCEM sequences from ([IT]) for the MCEM and QEM estimates. The algorithms 
were run with K — 1, 000 for 10 iterations with two different starting values (/3*-°^ — 1, 10). Table [S] 
presents the iterations of the MCEM and QEM sequences. These iteration sequences show that 
the QEM converges very fast. The MLE is /3 = 6.1341. The QEM sequences (after rounding) are 
the same as the MLE up to third decimal place after the sixth iteration. 

7.4 Weibull Interval Data 

The previous examples have indicated that the QEM algorithm outperforms the MCEM. In this 
example, we consider a real-data exa mple of intermittent inspection. The data in this example were 



originally provided bv lNelsonI (|1982[ ). The parts were intermittently inspected to obtain the number 
of cracked parts in each interval. The data from intermittent intermittent inspection are referred 
to as grouped data which provide only the number of failures in each inspection period. Table |6] 



23 



Table 6: Observed frequencies of intermittent inspection data. 



Inspection 


Observed 


time 


failures 


0- 


- 6.12 


5 


6.12- 


- 19.92 


16 


19.92 r. 


- 29.64 


12 


29.64 - 


- 35.40 


18 


35.40 - 


- 39.72 


18 


39.72 - 


- 45.24 


2 


45.24 - 


- 52.32 


6 


52.32 - 


- 63.48 


17 


63.48 - 


^ 


73 



gives t he data on cracked p arts. Other examples of g roupe d and c e nsore d are in ISeo and Yum 
(119931 ) . Ishapiro and Gulatil (|l998f ). lxiong and Jil |2004l) . and lMeekeJ |l986l) . These grouped data 
can be regarded as interval data. Thus, the proposed method is easily applicable to this case. 

The QEM estimate under the exponential model is A = 0.01209699 while QEM estimate under 
the WeibuU model is A = 0.001674018 and /3 = 1.497657. We used Ao = 1 and /3o = 1 for starting 
values and e = 10^^ for stopping criterion for the QEM algorithm. 



8 Concluding Remarks 

In this paper, we have shown that the QEM algorithm offers clear advantages over the MCEM. 
It reduces the computational burden required when using the MCEM because a much smaller 
size is required. Unlike the MCEM, the QEM also possesses very stable convergence properties 
at each step. The QEM algorithm provides a flexible and useful alternative when one is faced 
with a difficulty with the implementation of the ordinary EM algorithm. A variety of examples of 
application were also illustrated using the proposed method. 
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Appendix: Sketch Proof of the Uniqueness and Bounds of 
the Weibull Shape Parameter 



Analogous to the approach of iFarnum and BoothI (|l997l ). the uniqueness of the solution of (l^5|) 



can be proved as follows. For convenience, letting 

Z^i=l Z^k=l %,k i=l fe=l 

we rewrite (P^ by g{(3) — h{(3). The function g{/3) is strictly decreasing from oo to on /3 g [0, oo] 
while h{P) is increasing because it follows from the Jensen's inequality that 

K n K n K 



dh{l3) 1 



Yl Yl lik log' %,fe H Z! '?f fc - { im lik log i^-.k} 



i=l k=l i=l fe=l i=l fe=l 



>0. 



9^ {Er=iEf=i<j^ 

Now, it suffices to show that h{/3) > for some /3. Since 

n K 

lirn^ h{P) = ^X! Y { log^max - log Qi^kj, 

~^°° i=l k=l 

where (/max = maxi^/c {qi.kj, we have h{f3) > for some /3 unless qi^k = Qmax for all i and k. This 
condition is extremely unrealistic in practice. 

Next, we provide upper and lower bounds of (3. These bounds guarantee the unique solution in 
the interval and enable the root search algorithm to find the solution very stably and easily. Since 
/i(/3) is increasing, we have g{f3) < lim^^oo h{/3), that is, 



Er=l EfcLl (log 9max - log qi,k) 



Denote the above lower bound by Pl- Then, since h{(3) is again increasing, we have g{(3) = h{(3) > 
h{f3L), which leads to 



KPlY 
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